    • The Beauty of Complex Designs 

      Arnes, Jo Inge; Bongo, Lars Ailo (Chapter; Bokkapittel, 2020-12-08)
      The increasing use of omics data in epidemiology enables many novel study designs, but also introduces challenges for data analysis. We describe the possibilities for systems epidemiological designs in the Norwegian Women and Cancer (NOWAC) study and show how the complexity of NOWAC enables many beautiful new study designs. We discuss the challenges of implementing designs and analyzing data. Finally, ...
    • Cancer detection for white urban Americans 

      Møllersen, Kajsa; Bongo, Lars Ailo; Tafavvoghi, Masoud (Conference object; Konferansebidrag, 2023-06)
      Development, validation and comparison of machine learning methods require access to data, sometimes lots of data. Within health applications, data sharing can be restricted due to patient privacy, and the few publicly available data sets become even more valuable for the machine learning community. One such type of data are H&E whole slide images (WSI), which are stained tumour tissue, used in ...
    • Data-intensive computing infrastructure systems for unmodified biological data analysis pipelines 

      Bongo, Lars Ailo; Pedersen, Edvard; Ernstsen, Martin (Journal article; Tidsskriftartikkel; Peer reviewed, 2015-11-18)
      Biological data analysis is typically implemented using a deep pipeline that combines a wide array of tools and databases. These pipelines must scale to very large datasets, and consequently require parallel and distributed computing. It is therefore important to choose a hardware platform and underlying data management and processing systems well suited for processing large datasets. There are many ...
    • Evaluating the performance of the allreduce collective operation on clusters. Approach and results 

      Bongo, Lars Ailo; Anshus, Otto J.; Bjørndalen, John Markus (Research report; Forskningsrapport, 2004)
      The performance of the collective operations provided by a communication library is important for many applications run on clusters. The communication structure of collective operations can be organized as a tree. Performance can be improved by configuring and mapping the tree to the clusters in use. We describe and demonstrate an approach for evaluating the performance of different configurations ...
    • Functional Knowledge Transfer for High-accuracy Prediction of Under-studied Biological Processes 

      Park, Christopher Y.; Wong, Aaron K.; Greene, Casey S.; Rowland, Jessica; Guan, Yuanfang; Bongo, Lars Ailo; Burdine, Rebecca D.; Troyanskaya, Olga (Journal article; Tidsskriftartikkel; Peer reviewed, 2013)
      A key challenge in genetics is identifying the functional roles of genes in pathways. Numerous functional genomics techniques (e.g. machine learning) that predict protein function have been developed to address this question. These methods generally build from existing annotations of genes to pathways and thus are often unable to identify additional genes participating in processes that are not ...
    • IMP: a multi-species functional genomics portal for integration, visualization and prediction of protein functions and networks 

      Wong, Aaron K.; Park, Christopher Y.; Greene, Casey S.; Bongo, Lars Ailo; Guan, Yuanfang; Troyanskaya, Olga (Journal article; Tidsskriftartikkel; Peer reviewed, 2012)
      Integrative multi-species prediction (IMP) is an interactive web server that enables molecular biologists to interpret experimental results and to generate hypotheses in the context of a large cross-organism compendium of functional predictions and networks. The system provides a framework for biologists to analyze their candidate gene sets in the context of functional networks, as they expand or ...
    • Kvik: three-tier data exploration tools for flexible analysis of genomic data in epidemiological studies 

      Fjukstad, Bjørn; Olsen, Karina Standahl; Jareid, Mie; Lund, Eiliv; Bongo, Lars Ailo (Journal article; Tidsskriftartikkel; Peer reviewed, 2015-03-30)
      Kvik is an open-source system that we developed for explorative analysis of functional genomics data from large epidemiological studies. Creating such studies requires a significant amount of time and resources. It is therefore usual to reuse the data from one study for several research projects. Often each project requires implementing new analysis code, integration with specific knowledge bases, ...
    • Kvik: three-tier data exploration tools for flexible analysis of genomic data in epidemiological studies [version 1; peer review: 2 approved with reservations] 

      Fjukstad, Bjørn; Olsen, Karina Standahl; Jareid, Mie; Lund, Eiliv; Bongo, Lars Ailo (Journal article; Tidsskriftartikkel; Peer reviewed, 2015-03-30)
      Kvik is an open-source system that we developed for explorative analysis of functional genomics data from large epidemiological studies. Creating such studies requires a significant amount of time and resources. It is therefore usual to reuse the data from one study for several research projects. Often each project requires implementing new analysis code, integration with specific knowledge bases, ...
    • Lessons Learned Developing and Using a Machine Learning Model to Automatically Transcribe 2.3 Million Handwritten Occupation Codes 

      Pedersen, Bjørn-Richard; Holsbø, Einar; Andersen, Trygve; Shvetsov, Nikita; Ravn, Johan; Sommerseth, Hilde Leikny; Bongo, Lars Ailo (Journal article; Tidsskriftartikkel; Peer reviewed, 2022-01-06)
      Machine learning approaches achieve high accuracy for text recognition and are therefore increasingly used for the transcription of handwritten historical sources. However, using machine learning in production requires a streamlined end-to-end pipeline that scales to the dataset size and a model that achieves high accuracy with few manual transcriptions. The correctness of the model results must ...
    • The Longcut Wide Area Network Emulator. Design and Evaluation 

      Bongo, Lars Ailo (Research report; Forskningsrapport, 2005)
      Experiments run on a Grid, consisting of clusters administered by multiple organizations connected by shared wide area networks (WANs), may not be reproducible. First, traffic on the WAN cannot be controlled. Second, allocating the same resources for subsequent experiments can be difficult. Longcut solves both problems by splitting a single cluster into several parts, and for each part having one ...
    • Low-Cost Programmable Air Quality Sensor Kits in Science Education 

      Fjukstad, Bjørn; Angelvik, Nina; Hauglann, Maria Wulff; Knutsen, Joachim Sveia; Grønnesby, Morten; Gunhildrud, Hedinn; Bongo, Lars Ailo (Chapter; Bokkapittel, 2018-02-21)
      We describe our citizen science approach and technologies designed to introduce students in upper secondary schools to computational thinking and engineering. Using an Arduino microcontroller and low-cost sensors we have developed the air:bit, a programmable sensor kit that students build and program to collect air quality data. In our course, students develop their own research questions regarding ...
    • The metagenomic data life-cycle: standards and best practices 

      Ten Hoopen, Petra; Finn, Robert D.; Bongo, Lars Ailo; Corre, Erwan; Fosso, Bruno; Meyer, Folker; Mitchell, Alex; Pelletier, Eric; Pesole, Graziano; Santamaria, Monica; Willassen, Nils Peder; Cochrane, Guy (Journal article; Tidsskriftartikkel; Peer reviewed, 2017-08-01)
      Metagenomics data analyses from independent studies can only be compared if the analysis workflows are described in a harmonized way. In this overview, we have mapped the landscape of data standards available for the description of essential steps in metagenomics: (i) material sampling, (ii) material sequencing, (iii) data analysis, and (iv) data archiving and publishing. Taking examples from marine ...
    • Mr. Clean: A Tool for Tracking and Comparing the Lineage of Scientific Visualization Code 

      Tartari, Giacomo; Tiede, Lars; Holsbø, Einar; Knudsen, Kenneth; Raknes, Inge Alexander; Fjukstad, Bjørn; Mode, Nicolle; Bjørndalen, John Markus; Lund, Eiliv; Bongo, Lars Ailo (Conference object; Konferansebidrag, 2014)
    • Norwegian e-Infrastructure for Life Sciences (NeLS) 

      Tekle, Kidane M; Gundersen, Sveinung; Klepper, Kjetil; Bongo, Lars Ailo; Raknes, Inge Alexander; Li, Xiaxi; Zhang, Wei; Andreetta, Christian; Mulugeta, Teshome Dagne; Kalaš, Matúš; Rye, Morten Beck; Hjerde, Erik; Antony Samy, Jeevan Karloss; Fornous, Ghislain; Azab, Abdulrahman; Våge, Dag Inge; Hovig, Eivind; Willassen, Nils Peder; Drabløs, Finn; Nygård, Ståle; Petersen, Kjell; Jonassen, Inge (Journal article; Tidsskriftartikkel; Peer reviewed, 2018-06-29)
      The Norwegian e-Infrastructure for Life Sciences (NeLS) has been developed by ELIXIR Norway to provide its users with a system enabling data storage, sharing, and analysis in a project-oriented fashion. The system is available through easy-to-use web interfaces, including the Galaxy workbench for data analysis and workflow execution. Users confident with a command-line interface and programming may ...
    • A Pragmatic Machine Learning Approach to Quantify Tumor-Infiltrating Lymphocytes in Whole Slide Images 

      Shvetsov, Nikita; Grønnesby, Morten; Pedersen, Edvard; Møllersen, Kajsa; Rasmussen Busund, Lill-Tove; Schwienbacher, Ruth; Bongo, Lars Ailo; Kilvær, Thomas Karsten (Journal article; Tidsskriftartikkel; Peer reviewed, 2022-06-16)
      Increased levels of tumor-infiltrating lymphocytes (TILs) indicate favorable outcomes in many types of cancer. The manual quantification of immune cells is inaccurate and time-consuming for pathologists. Our aim is to leverage a computational solution to automatically quantify TILs in standard diagnostic hematoxylin and eosin-stained sections (H&E slides) from lung cancer patients. Our approach ...
    • Predicting breast cancer metastasis from whole-blood transcriptomic measurements 

      Holsbø, Einar; Perduca, Vittorio; Bongo, Lars Ailo; Lund, Eiliv; Birmelé, Etienne (Journal article; Tidsskriftartikkel; Peer reviewed, 2020-05-20)
      Objective - In this exploratory work we investigate whether blood gene expression measurements predict breast cancer metastasis. Early detection of increased metastatic risk could potentially be life-saving. Our data comes from the Norwegian Women and Cancer epidemiological cohort study. The women who contributed to these data provided a blood sample up to a year before receiving a breast ...
    • Reproduction study using public data of: Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs 

      Voets, Mike; Møllersen, Kajsa; Bongo, Lars Ailo (Journal article; Tidsskriftartikkel; Peer reviewed, 2019-06-06)
      We have attempted to reproduce the results in "Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs", published in JAMA 2016; 316(22), using publicly available data sets. We re-implemented the main method in the original study since the source code is not available. The original study used non-public fundus images from EyePACS ...
    • Social network analysis of Staphylococcus aureus carriage in a general youth population 

      Stensen, Dina Benedicte Berg; Cañadas, Rafael A. Nozal; Småbrekke, Lars; Olsen, Karina; Nielsen, Christopher Sivert; Svendsen, Kristian; Hanssen, Anne Merethe; Ericson, Johanna U; Simonsen, Gunnar Skov; Bongo, Lars Ailo; Furberg, Anne-Sofie (Journal article; Tidsskriftartikkel; Peer reviewed, 2022-08-31)
      Objectives: Staphylococcus aureus carriage increases infection risk. We used social network analysis to evaluate whether contacts have the same S. aureus genotype indicating direct transmission, or whether contagiousness is an indirect effect of contacts sharing the same lifestyle or characteristics. Methods: The Fit Futures 1 study collected data on social contact among 1038 high school students. ...
    • Transparent Incremental Updates for Genomics Data Analysis Pipelines 

      Pedersen, Edvard; Willassen, Nils Peder; Bongo, Lars Ailo (Chapter; Bokkapittel, 2014)
      A large up-to-date compendium of integrated genomic data is often required for biological data analysis. The compendium can be tens of terabytes in size, and must often be frequently updated with new experimental or meta-data. Manual compendium update is cumbersome, requires a lot of unnecessary computation, and it may result in errors or inconsistencies in the compendium. We propose a transparent ...